Precision Efficacy Analysis for Regression

Author

  • Gordon P. Brooks
Abstract

When multiple linear regression is used to develop a prediction model, the sample size must be large enough to ensure stable coefficients. If the derivation sample is too small, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes such that models will predict as well as possible in future samples. Previous studies have shown the sample sizes suggested by the PEAR method to be superior to those of other methods in limiting cross-validity shrinkage to acceptable a priori levels. A Monte Carlo study was conducted to verify the PEAR method further for the selection of regression sample sizes and to extend the analysis to include an investigation of the effects of multicollinearity on coefficient estimates obtained through multiple linear regression analysis. Appendixes show the derivation of the PEAR method for sample size selection and give correlation matrices, stem-and-leaf plots, and histograms of cross-validity for the study. (Contains 10 tables, 4 figures, and 116 references.)
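The idea of choosing a sample size so that cross-validity shrinkage stays within an a priori limit can be illustrated with a minimal sketch. It does not reproduce the PEAR formula, whose derivation is in the paper's appendix and is not shown on this page. Instead it assumes the Stein-Darlington cross-validity estimate as a stand-in, and the function names `stein_cross_validity` and `smallest_n` are hypothetical.

```python
def stein_cross_validity(r2: float, n: int, p: int) -> float:
    """Stein-Darlington estimate of squared cross-validity.

    r2: expected sample R-squared, n: sample size, p: number of predictors.
    This formula is an assumption of this sketch, not the PEAR derivation.
    """
    if n <= p + 2:
        raise ValueError("n must be greater than p + 2")
    factor = (
        ((n - 1) / (n - p - 1))
        * ((n - 2) / (n - p - 2))
        * ((n + 1) / n)
    )
    return 1.0 - factor * (1.0 - r2)


def smallest_n(r2: float, p: int, max_shrinkage: float,
               n_max: int = 100_000) -> int:
    """Smallest n whose estimated shrinkage (r2 minus cross-validity)
    does not exceed max_shrinkage, found by a linear search."""
    for n in range(p + 3, n_max + 1):
        if r2 - stein_cross_validity(r2, n, p) <= max_shrinkage:
            return n
    raise ValueError("no sample size up to n_max meets the shrinkage limit")


if __name__ == "__main__":
    # Illustrative inputs only: 4 predictors, expected R-squared of 0.25,
    # and a tolerated shrinkage of 0.05.
    n = smallest_n(r2=0.25, p=4, max_shrinkage=0.05)
    print(n, round(stein_cross_validity(0.25, n, 4), 4))
```

Because the correction factor decreases toward 1 as n grows, the first n that meets the limit is also the smallest, so a plain linear search is enough here.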


Related resources

The Precision Efficacy Analysis for Regression Sample Size Method

The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to determine the subject-to-variable ratio appropriate f...


A Comparison of Thin Plate and Spherical Splines with Multiple Regression

Thin plate and spherical splines are nonparametric methods suitable for spatial data analysis. Thin plate splines yield efficient, practical, high-precision solutions in spatial interpolation. Two components are considered in the model fitting: the spatial deviations of the data and the model roughness. On the other hand, in parametric regression, the relationship between explanatory and response v...


The Effect of Transitive Closure on the Calibration of Logistic Regression for Entity Resolution

This paper describes a series of experiments in using logistic regression machine learning as a method for entity resolution. From these experiments the authors concluded that when a supervised ML algorithm is trained to classify a pair of entity references as linked or not linked, the evaluation of the model’s performance should take into account the transitive closure of its pairwise lin...


The Relationship between Tertiary Level EFL Teachers’ Self-Efficacy Perceptions and Their Level of Linguistic Proficiency

Teacher self-efficacy has been identified as an important characteristic of teachers that can positively influence both teacher and student outcomes. The relationship between teachers’ self-efficacy and their linguistic proficiency, however, is yet to be investigated. The present study was an attempt to examine the rather under-researched issue of teachers’ level of linguistic competence in the...


Estimation of coal proximate analysis factors and calorific value by multivariable regression method and adaptive neuro-fuzzy inference system (ANFIS)

Proximate analysis is the most common form of coal evaluation, and it reveals the quality of a coal sample. It examines four factors: the moisture, ash, volatile matter (VM), and fixed carbon (FC) within the coal sample. Every factor is determined through a distinct experimental procedure under ASTM-specified conditions. These determinations are time-consuming and require a signific...




Publication date: 2012